Abstract — This paper investigates the construction of linear 
network codes for broadcasting a set of data packets to a 
number of users. The links from the source to the users are 
modeled as independent erasure channels. Users are allowed to 
inform the source node whether a packet is received correctly 
via feedback channels. In order to minimize the number of 
packet transmissions until all users have received all packets 
successfully, it is necessary that a data packet, if successfully 
received by a user, can increase the dimension of the vector 
space spanned by the encoding vectors he or she has received 
by one. Such an encoding vector is called innovative. To reduce 
decoding complexity, sparse encoding vectors are preferred, since 
the sparsity can be exploited when solving systems of linear 
equations. Generating a sparsest encoding vector with large finite 
field size, however, is shown to be NP-hard. An approximation 
algorithm is constructed. For the binary field, heuristic algorithms 
are also proposed. 

Index Terms — Erasure broadcast channel, network coding, 
computational complexity. 



I. Introduction 

Broadcasting has been a challenging issue in telecommu- 
nications. The challenge mainly comes from how a trans- 
mitter can disseminate a common information content to all 
users/receivers reliably and efficiently via a broadcast channel 
which could be unstable and error-prone. More specifically, 
one of the ultimate goals of broadcasting is to provide a
transmission scheme that disseminates a common information
content, or a set of packets, to all users with the minimum
number of transmissions by the sender. This measure is commonly
called the completion time of a broadcast system. 

Several classical approaches provide heuristic solutions to 
the above issue. With user feedback, automatic repeat request 
(ARQ) offers reliable retransmissions for the erased packets 
due to channel impairments. However, such an approach 
becomes inefficient when the number of users increases, as 
the users may have entirely distinct needs for the erased 
packets. Reliable broadcasting can also be achieved without 
user feedback by forward error correction. With the use of 
erasure codes, a user can reconstruct the entire set of original 
packets, provided that the number of erased packets is smaller 
than a certain threshold. However, the number of packets that 
will be erased depends on the channel erasure probability, 
which is time-varying and hard to predict. That limits the use 
of erasure codes in broadcasting. To improve upon classical 
approaches, linear network coding [1], [2] has
been shown to be a promising solution [3]-[6].

The idea of linear network coding for broadcasting is that
a transmitter broadcasts to K users encoded packets that are
obtained by linearly combining the N original packets over the
finite field GF(q). An encoding vector specifies the coefficients
of the linear combination. An encoded packet, together with a
header that contains the corresponding encoding vector, is
broadcast to all users. A packet is said to be innovative to a
user if the corresponding encoding vector is not in the subspace
spanned by the encoding vectors already received by that user.
It is called innovative if it is innovative to all users who have
not yet received enough packets for decoding. It is shown in [7]
that an innovative packet can always be found if q ≥ K. Once a
user receives any N innovative packets, he or she can decode the
N original packets by Gauss-Jordan elimination. Therefore, the
generation of innovative packets is vital. If all the encoded
packets are innovative, the completion time is minimized, and
such a network-coded broadcast scheme is rate-optimal.

Linear network codes for broadcasting can be generated
with or without feedback. LT codes [8], Raptor codes [9]
and random linear network codes (RLNC) [10] can be used
without feedback. By suitably choosing design parameters,
innovative packets can be generated by those coding schemes
with high probability. LT codes and Raptor codes are generated
by an optimized degree distribution. However, they are mainly
designed for broadcasting a huge number of packets, and may
not be good choices when the number of packets is only
moderately large. With feedback, the use of the Jaggi-Sanders
algorithm [11] is suggested in [7]; it is a general network
code generation method and is able to find innovative encoding
vectors for q ≥ K. However, its encoding and decoding
complexities are relatively high, as it is not specifically
designed for the broadcast application. Therefore, some
heuristics have been proposed [12]-[15]. It is suggested in [16]
that encoded packets should be instantly decodable, in the sense
that a new packet can be decoded once it is available at a
receiver without waiting for the complete reception of the full
set of packets. However, as a packet that is instantly decodable
to all users may not exist, the completion time is in general
larger than that in a system without this extra requirement.
With the idea of instant decodability, some works focus on minimizing
decoding delay, where a unit of decoding delay is defined as 
that an encoded packet is successfully received by a user but 
that packet is not innovative or not instantly decodable to him 
or her EHUS). 

The excellent performance of the linear network coding 
approaches to broadcast encourages researchers to consider 
its practicality. In fact, the decoding complexity of a linear 
network code is an important issue in practice. One possible 
way to reduce decoding complexity is to use sparse encoding 
vectors. This sparsity property is important, as it can be ex- 
ploited in the decoding process. For example, a fast algorithm 
by Wiedemann for solving a system of sparse linear equations 
can be used ll2p . If the Hamming weight of each encoding 
vector is at most w, the complexity for solving an N x N 
linear system can be reduced from $O(N^3)$ using Gaussian
elimination to $O(wN^2)$. The Wiedemann algorithm is useful
when N is large. When N is moderate, we can implement 
some sparse representation of matrices, so that even if the usual
Gaussian elimination is used, the number of additions and
multiplications required can be reduced. For other fast methods
for solving linear equations over finite fields, we refer the
readers to [22], [23].

Minimizing the completion time and reducing the decoding 
complexity are equally important in linear network code design 
for erasure broadcast channel. However, the innovativeness of 
encoding vectors together with their sparsity has not been 
thoroughly studied. Given the encoding vectors which have
been received by the users in a broadcast system, how a
transmitter can generate encoding vectors which are both sparse
and innovative is a challenging problem. In this paper, we
address the issue by developing a method called the Optimal
Hitting Method and its approximation version, called the
Greedy Hitting Method. Both of them are able to generate
sparse and innovative encoding vectors for q ≥ K. That
results in a significant reduction in decoding complexity when 
compared with their non-sparse counterparts. Furthermore, 
based on the greedy hitting method, we develop a suboptimal 
procedure to improve the completion time performance for 
q = 2, where the existence of innovative encoding vectors is 
not guaranteed. Simulation results show that its performance 
is nearly optimal. 

The rest of this paper is organized as follows. We review
the literature on complexity in network coding in Section II. In
Section III, the system model is introduced. In Section IV, we
consider the innovativeness issue. We characterize innovative
encoding vectors by a linear algebra approach and prove that
determining whether there exists an innovative vector for q = 2
is NP-complete. In Section V, the sparsity issue is considered.
After showing that K-sparse innovative vectors always exist if
q ≥ K, we investigate the problem of finding sparsest innovative
vectors, which we call the SPARSITY problem. SPARSITY
is proven to be NP-complete. In Section VI, we present
a systematic way to solve SPARSITY using binary integer
programming. A polynomial-time approximation algorithm is
also constructed. In Section VII, our algorithms are compared
with some other transmission schemes by simulations. Finally,
conclusions are drawn in Section VIII.

II. Literature on Complexity Classes of Network 
Coding Problems 

A considerable amount of research has been done on the 
complexity issues in conventional coding theory (see the
survey in [24], for example). For instance, it is shown in [25]
and [26] that the problems of finding the weight distribution 
and the minimum distance of linear codes are NP-hard. The 
complexity issues in network coding are less well understood. 

For linear network codes, Lehman and Lehman investigated
the complexity of a class of network coding problems in [27],
and proved that some of the problems are NP-complete.
Construction of linear network codes using a technique called
matrix completion is considered in [28], and the complexity
class of the matrix completion problem is studied in [29]. It
is shown in [30] that approximating the capacity of network
coding is also a hard problem.

To minimize encoding complexity, Langberg, Sprintson and 
Bruck divide the nodes in a general network topology into 
two classes. The nodes in one class forward packets without 
any coding while the nodes in another class perform network 
coding. The problem of minimizing the number of encoding 
nodes is shown to be NP-complete in [31], [32].

El Rouayheb, Chaudhry and Sprintson study the complexity
of a related problem, called the index coding problem, in [33].
They consider the noiseless broadcast channel, and show that when
the coefficient field is binary, the problem of minimizing the
number of packet transmissions is NP-hard. A complementary
version of index coding is studied in [34]. It is shown
that complementary index coding is NP-hard, and that even
obtaining an approximate solution is NP-hard.

In [35], Milosavljevic et al. study a related system. The
users are interested in a common data file but only have partial
knowledge of the file. By interactively sending data to one
another through a noiseless broadcast channel, the users want to
minimize the total amount of data sent through the channel. It
is shown in [35] that the optimal rate allocations can be found
in polynomial time.

In this paper, the problem setting is similar, except that the 
channel is modeled as an erasure broadcast channel, and we 
focus on the innovativeness and sparsity aspects of generating 
encoding vectors. 

III. System Model 

Consider a single-hop broadcast system, in which there is 
one transmitter and K users. The transmitter wants to send 
a file to all users via a broadcast channel, which is modeled 
as a discrete-time broadcast packet erasure channel. At each 
time, the transmitter broadcasts a packet, which is an element 
in GF(q). Each user either receives the packet successfully or 
experiences a packet loss. In other words, the output symbol 
of a user is either the same as the input symbol or an erasure 
symbol. The K output symbols of the users are assumed to 
be independent of one another. 

In this paper, we focus on transmission schemes with linear 
network coding. The file is divided into N equal-size packets. 
Each packet transmitted by the transmitter is obtained by 
linearly combining the N packets, with coefficients drawn 
from GF(q). An N-vector whose components are the N 
coefficients is said to be the encoding vector of that packet. A 
packet together with a header that contains the corresponding 
encoding vector is broadcast to K users. 

We assume that there is an error-free feedback channel 
from each user to the transmitter. Upon receiving a packet 
successfully, a user sends an acknowledgement (ACK) to the 
transmitter without delay or error. The transmitter keeps track 
of what each user has received. The next transmitted packet 
depends on all the previous ACKs from the K users. 

Given an encoding vector $x = (x_1, x_2, \ldots, x_N)$ (footnote 1)
of dimension N over GF(q), the support of x, denoted by supp(x),
is the set of indices of the non-zero components in x, i.e.,

$$\operatorname{supp}(x) = \{\, i : x_i \neq 0 \,\}.$$

The Hamming weight of x is defined as the cardinality of
supp(x). An encoding vector that has Hamming weight less
than or equal to w is said to be w-sparse.

Note that a transmitted packet is useful to a user if its 
corresponding encoding vector does not lie within the span 
of all previously received encoding vectors of that user. We 
say that such an encoding vector is innovative to that user. An 
encoding vector that is innovative to all users is simply said to 
be innovative. It is clear that a rate-optimal solution is to let 
the transmitter always broadcast innovative packets, provided 
that innovative encoding vectors always exist. 

IV. The Innovative Encoding Vector Problem 

Suppose that user k, for k = 1, 2, ..., K, has already
received $r_k$ packets whose encoding vectors are linearly
independent. Let $C_k$ be the $r_k \times N$ encoding matrix of
user k, whose rows are the encoding vectors. Without loss of
generality, we assume that $r_k < N$, for otherwise user k
can decode the file successfully and can be omitted from our
consideration. A vector x is innovative if it does not belong
to the row space of $C_k$ for any k. The set of all innovative
encoding vectors, $\mathcal{I}$, is given by

$$\mathcal{I} = GF(q)^N \setminus \bigcup_{k=1}^{K} \operatorname{rowspace}(C_k). \tag{1}$$

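Example: Let q = 2, K = 2 and N = 4, with $r_1 = 2$ and $r_2 = 3$.
The two matrices $C_1$ and $C_2$ over GF(2) are given by

$$C_1 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad
C_2 = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}.$$

The row space of $C_1$ consists of the four vectors [0 0 0 0],
[1 1 0 0], [0 0 1 0], and [1 1 1 0]. The row space of $C_2$
consists of the eight vectors that have even Hamming weight.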

Footnote 1: Throughout this paper, all vectors are column vectors. For a
vector x, we use parentheses and commas when its components are listed
horizontally, i.e., $(x_1, x_2, \ldots, x_N)$.



There are six innovative encoding vectors, and they form the
set

$$\mathcal{I} = \{(1,0,0,0),\ (0,1,0,0),\ (0,0,0,1),\ (0,1,1,1),\ (1,0,1,1),\ (1,1,0,1)\}.$$
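
The example can be checked by brute force. The following is a minimal sketch, not part of the original algorithmic development: it enumerates GF(2)^4 and removes every vector lying in a user's row space. The function names are hypothetical, and the rows chosen for C_2 are an assumed basis of the even-weight vectors (any such basis gives the same row space).

```python
from itertools import product

def row_space_gf2(rows):
    """Return the set of all GF(2) linear combinations of the given rows."""
    n = len(rows[0])
    space = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        vec = [0] * n
        for c, row in zip(coeffs, rows):
            if c:
                vec = [(a + b) % 2 for a, b in zip(vec, row)]
        space.add(tuple(vec))
    return space

def innovative_set_gf2(matrices, n):
    """Vectors of GF(2)^n that lie outside every user's row space."""
    covered = set().union(*(row_space_gf2(m) for m in matrices))
    return {v for v in product((0, 1), repeat=n) if v not in covered}

# Encoding matrices of the two users (C2: assumed basis of the even-weight code).
C1 = [(1, 1, 0, 0), (0, 0, 1, 0)]
C2 = [(1, 0, 0, 1), (0, 1, 0, 1), (0, 0, 1, 1)]
innovative = innovative_set_gf2([C1, C2], 4)
print(sorted(innovative))
```

Exhaustive enumeration costs on the order of q^N operations, so it is only meant for verifying small examples such as this one.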

It is well known that $\mathcal{I}$ is non-empty if the finite field
size, q, is larger than or equal to the number of users, K [7]. We
repeat the simple proof below for the sake of completeness.
We begin with a simple lemma, which will be used again in 
a later section. 

Lemma 1. Let $A_1, A_2, \ldots, A_K$ be finite subsets of a universal
set $\mathcal{U}$. If $K \geq 2$ and $A_1, A_2, \ldots, A_K$ contain a
common element, then

$$|A_1 \cup A_2 \cup \cdots \cup A_K| < |A_1| + |A_2| + \cdots + |A_K|.$$

Proof: Suppose $x \in A_i$ for all i. Let $A_i^*$ be the set
$A_i \setminus \{x\}$ for i = 1, 2, ..., K. By applying the union bound,

$$|A_1^* \cup A_2^* \cup \cdots \cup A_K^*| \leq |A_1^*| + |A_2^*| + \cdots + |A_K^*|.$$

This implies

$$|A_1 \cup A_2 \cup \cdots \cup A_K| - 1 \leq (|A_1| - 1) + (|A_2| - 1) + \cdots + (|A_K| - 1).$$

As $K \geq 2$ by hypothesis, we obtain the inequality in the
lemma. ■

Theorem 2 ([7]). If q ≥ K, then $\mathcal{I}$ is non-empty.

Proof: For k = 1, 2, ..., K, let $V_k$ be the row space of
$C_k$. The subspace $V_k$ consists of the $q^{r_k}$ encoding vectors
which are not innovative to user k. Obviously the zero vector
is a common vector of these K subspaces. By Lemma 1,
the union of these K subspaces contains strictly less than
$\sum_{k=1}^{K} q^{r_k}$ vectors. Since K ≤ q, we have

$$\sum_{k=1}^{K} q^{r_k} \leq K q^{N-1} \leq q^N.$$

Therefore there exists at least one encoding
vector which is innovative to all users. ■
The condition q ≥ K cannot be improved for any prime
power q. The following example shows the non-existence of an
innovative encoding vector for the case where K = q + 1.
Let U be the ambient space $GF(q)^N$, and V be a subspace
of U with dimension N − 2. Let $v_1, \ldots, v_{N-2}$ be a basis
of V. The $q^2$ cosets of V in U form the quotient space
U/V, which is a vector space over GF(q). Let
$\phi : U \to U/V$ be the canonical mapping from U to the
quotient vector space U/V. Because U/V has dimension 2
over GF(q), we can find q + 1 vectors in U/V such that
none of them is a scalar multiple of another, i.e., they are
pairwise linearly independent. As $\phi$ is a surjective mapping,
we can choose vectors $u_1, u_2, \ldots, u_{q+1}$ in U such that
$\phi(u_1), \phi(u_2), \ldots, \phi(u_{q+1})$ are pairwise linearly
independent in U/V. For k = 1, 2, ..., q + 1, define $C_k$ as the
$(N-1) \times N$ matrix whose row vectors are $v_1, \ldots, v_{N-2}$
and $u_k$. The row spaces corresponding to the matrices $C_k$ satisfy

(i) $\operatorname{rank}(C_k) = N - 1$ for all k.

(ii) $\operatorname{rowspace}(C_i) \cap \operatorname{rowspace}(C_j) = V$ whenever $i \neq j$.

(iii) For k = 1, 2, ..., q + 1, the sets $\operatorname{rowspace}(C_k) \setminus V$ are
mutually disjoint.

The number of elements in the union of the row spaces of the
$C_k$'s is

$$|V| + \sum_{k=1}^{q+1} |\operatorname{rowspace}(C_k) \setminus V| = q^{N-2} + (q+1)\left(q^{N-1} - q^{N-2}\right) = q^N.$$

Hence the row spaces of the $C_k$'s cover the whole vector space
$GF(q)^N$. Any encoding vector we pick from $GF(q)^N$ is not
innovative to at least one user.

As an example, we consider the case q = 3, K = 4 and
N = 3. If the encoding matrices are

[four $2 \times 3$ matrices $C_1, C_2, C_3, C_4$ over GF(3); entries illegible in the source]

then we cannot find any innovative encoding vector.
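
The covering construction can be checked numerically for q = 3, K = 4 and N = 3. The sketch below is illustrative only: the basis vector `v` of V and the four vectors in `u_rows` are assumptions chosen to satisfy the pairwise-independence condition modulo V, and are not claimed to be the matrices of the original example.

```python
from itertools import product

def row_space(rows, q):
    """All GF(q) linear combinations of the rows (q must be prime here)."""
    n = len(rows[0])
    return {
        tuple(sum(c * r[i] for c, r in zip(coeffs, rows)) % q for i in range(n))
        for coeffs in product(range(q), repeat=len(rows))
    }

q, N = 3, 3
v = (1, 1, 1)  # assumed basis of the (N-2)-dimensional subspace V
# Hypothetical u_1..u_4, pairwise linearly independent modulo V.
u_rows = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 2)]
matrices = [[v, u] for u in u_rows]

covered = set().union(*(row_space(m, q) for m in matrices))
uncovered = set(product(range(q), repeat=N)) - covered
print(len(covered), len(uncovered))
```

Under these assumptions the four row spaces together contain all 27 vectors of GF(3)^3, in agreement with the count q^{N-2} + (q+1)(q^{N-1} - q^{N-2}) = q^N.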

The set of innovative encoding vectors, $\mathcal{I}$, can be
characterized by the orthogonal complements of the row spaces of the
$C_k$'s, which are also known as the null spaces of the $C_k$'s. For
k = 1, 2, ..., K, let $V_k$ be the row space of $C_k$. Denote the
orthogonal complement of $V_k$ by $V_k^{\perp}$,

$$V_k^{\perp} := \{\, v \in GF(q)^N : x \cdot v = 0 \text{ for all } x \in V_k \,\},$$

where $x \cdot v$ is the inner product of x and v. We will use the
fact from linear algebra that a vector x is in $V_k$ if and only if
$x \cdot v = 0$ for all $v \in V_k^{\perp}$. Let $B_k$ be an
$(N - r_k) \times N$ matrix whose rows form a basis of $V_k^{\perp}$.
To see whether a vector x is in $V_k$, it amounts to checking the
condition $B_k x = 0$; if $B_k x = 0$, then $x \in V_k$, and vice versa.

There are many different choices for the basis of the
orthogonal complement $V_k^{\perp}$. We can obtain one such choice
via the reduced row-echelon form (RREF) of $C_k$. Suppose we
have obtained the RREF of $C_k$ by elementary row operations.
By appropriately permuting its columns, we can write it in
the following form:

$$[\, I_{r_k} \mid A_k \,]\, P_k, \tag{2}$$

where $I_{r_k}$ is the $r_k \times r_k$ identity matrix, $A_k$ is an
$r_k \times (N - r_k)$ matrix over GF(q), and $P_k$ is an $N \times N$
permutation matrix. (Recall that a permutation matrix is a square
zero-one matrix in which each column and each row contain exactly
one "1".) We can take

$$B_k = [\, -A_k^T \mid I_{N - r_k} \,]\, P_k. \tag{3}$$

The superscript T represents the transpose operator. It is
straightforward to verify that the product of the matrix in (2)
and $B_k^T$ is a zero matrix. Hence, the rows of $B_k$ are orthogonal
to the rows of $C_k$, and form a basis of $V_k^{\perp}$.
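
The RREF-based construction of a null-space basis can be sketched as follows. This is a minimal illustration for a prime field GF(p), not the incremental method of the appendix; the function names are hypothetical. Instead of forming the permutation matrix explicitly, the code places the entries of -A^T in the pivot columns and the identity in the free columns, which is the same matrix written column by column.

```python
def rref_mod_p(rows, p):
    """Reduced row-echelon form over GF(p), p prime. Returns (rows, pivot columns)."""
    m = [[x % p for x in r] for r in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        pr = next((i for i in range(r, len(m)) if m[i][c]), None)
        if pr is None:
            continue                      # no pivot in this column
        m[r], m[pr] = m[pr], m[r]
        inv = pow(m[r][c], -1, p)         # modular inverse (Python 3.8+)
        m[r] = [(x * inv) % p for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c]:
                f = m[i][c]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m[:r], pivots

def null_space_basis(rows, p):
    """Rows of B with B x = 0 exactly when x lies in the row space of `rows`."""
    n = len(rows[0])
    R, pivots = rref_mod_p(rows, p)
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        b = [0] * n
        b[f] = 1                          # identity part (free column)
        for row, pc in zip(R, pivots):
            b[pc] = (-row[f]) % p         # entry of -A^T (pivot column)
        basis.append(b)
    return basis

def is_innovative_to(B, x, p):
    """x is innovative to the user with null-space basis B iff B x != 0."""
    return any(sum(bi * xi for bi, xi in zip(b, x)) % p for b in B)

B1 = null_space_basis([(1, 1, 0, 0), (0, 0, 1, 0)], 2)
print(B1, is_innovative_to(B1, (1, 0, 0, 0), 2))
```

For non-prime field sizes the modular arithmetic would have to be replaced by proper extension-field arithmetic, which this sketch does not attempt.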

In the appendix, we give another way of computing a basis, 
which is suitable for incremental processing. 

The following simple result characterizes the set of innovative
encoding vectors, $\mathcal{I}$:

Theorem 3. Given $C_1, C_2, \ldots, C_K$, an encoding vector x
belongs to $\mathcal{I}$ if and only if $B_k x \neq 0$ for all k.

Proof: If $B_k x \neq 0$, then x is not in $V_k$ and therefore is
innovative to user k. It is innovative if $B_k x \neq 0$ for all k.
Conversely, if $B_k x = 0$ for some k, then x is in $V_k$, and
hence is not innovative to user k. Therefore, $x \notin \mathcal{I}$. ■

When the underlying finite field size is small, innovative 
encoding vectors may not exist. For further investigation of 
the existence of innovative encoding vectors, we formulate 
the following decision problem: 

Problem: IEV$_q$

Instance: K matrices, $C_1, C_2, \ldots, C_K$, over GF(q), each
of which has N columns.

Question: Is there an N-dimensional vector x over GF(q)
which does not belong to the row space of $C_k$ for
k = 1, 2, ..., K?

The following result shows that the decision problem is
NP-complete for q = 2.

Theorem 4. IEV$_2$ is NP-complete.

Proof: The idea is to reduce the 3-SAT problem, well known
to be NP-complete [36], to the IEV$_2$ problem. Recall
that the 3-SAT problem is a Boolean satisfiability problem,
whose instance is a Boolean expression written in conjunctive
normal form with three variables per clause (3-CNF), and the
question is to decide if there is some assignment of TRUE and
FALSE values to the variables such that the given Boolean
expression has a TRUE value.

Let E be a given Boolean expression with n variables
$x_1, \ldots, x_n$, and m clauses in 3-CNF. We want to reduce the
3-SAT problem to the IEV$_2$ problem with N = n + 1 packets
and K = m + 1 users.

For the i-th clause (i = 1, 2, ..., m), we first construct a
$3 \times (n+1)$ matrix $B_i$. If the j-th literal (j = 1, 2, 3) in the
i-th clause is $x_k$, then let the k-th component in the j-th row of
$B_i$ be one, and the other components be all zero. Otherwise, if
the j-th literal in the i-th clause is $\neg x_k$, then let the k-th and
the (n+1)-st components in the j-th row of $B_i$ be both one,
and the remaining components be all zero. Let $C_i$ be a matrix
whose rows form a basis of the orthogonal complement of the
row space of $B_i$. We will use the fact that a vector v is in the
row space of $C_i$ if and only if $B_i v = 0$.

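Consider an example with n = 4 Boolean variables. From
the clause $\neg x_1 \vee \neg x_2 \vee x_3$, we get

$$B_i = \begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}, \qquad
C_i = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 \\ 1 & 1 & 0 & 0 & 1 \end{bmatrix}.$$

It can be verified that each row in $B_i$ is orthogonal to the rows
in $C_i$, i.e., the row space of $C_i$ is the orthogonal complement
of the row space of $B_i$.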

For the extra user, user m + 1, let $B_{m+1}$ be the $1 \times (n+1)$
matrix $[0_n \; 1]$, where $0_n$ stands for the $1 \times n$ all-zero vector.
The problem reduction can be done in polynomial time.

Let $x = (x_1, x_2, \ldots, x_n)$ be a Boolean vector and
$\hat{x} = (x, 1)$. Note that any solution x to a given 3-SAT problem
instance makes the product $B_j \hat{x}$ a non-zero vector for
j = 1, 2, ..., m, and $[0_n \; 1]\,\hat{x} \neq 0$. Therefore $\hat{x}$ is
not in the row space of $C_j$ for all j. Hence $\hat{x}$ is also a
solution to the derived IEV$_2$ problem.

Conversely, any solution to the derived IEV$_2$ problem also
yields a solution to the original 3-SAT problem. Let
$c = (c_1, c_2, \ldots, c_n, c_{n+1}) \in GF(2)^{n+1}$ be a solution to the
derived IEV$_2$ problem. Note that we must have $c_{n+1} = 1$
because of $B_{m+1}$. Let i be an integer between 1 and m. Since
c is not in the row space of $C_i$, the product $B_i c$ is a non-zero
vector. Hence, if we assign TRUE to $x_k$ if $c_k = 1$ and FALSE
to $x_k$ if $c_k = 0$, for k = 1, 2, ..., n, then the i-th clause will
have a TRUE value. Since this is true for all i, the whole
Boolean expression also has a TRUE value.

The problem IEV$_2$ is clearly in NP, since it is efficiently
verifiable. Hence it is NP-complete. ■
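
The reduction in the proof can be sketched in a few lines. The code below is illustrative: clauses are encoded as tuples of signed, 1-based variable indices (an assumed convention, with a negative sign denoting negation), and all function names are hypothetical. It works with the matrices B_i directly, using the fact that a vector c is outside the row space of C_i exactly when B_i c is non-zero over GF(2).

```python
from itertools import product

def clause_matrix(clause, n):
    """Build B_i for one clause, e.g. (-1, -2, 3) for (not x1) or (not x2) or x3."""
    rows = []
    for lit in clause:
        row = [0] * (n + 1)
        row[abs(lit) - 1] = 1
        if lit < 0:
            row[n] = 1          # a negated literal also uses the extra coordinate
        rows.append(row)
    return rows

def reduce_3sat(clauses, n):
    """Return B_1, ..., B_m followed by B_{m+1} = [0_n 1]."""
    return [clause_matrix(c, n) for c in clauses] + [[[0] * n + [1]]]

def is_iev_solution(Bs, c):
    """c is innovative to every user iff each product B_i c is non-zero over GF(2)."""
    return all(any(sum(a * b for a, b in zip(row, c)) % 2 for row in B) for B in Bs)

def satisfies(clauses, x):
    """Direct evaluation of the 3-CNF expression on a 0/1 assignment x."""
    return all(any((x[abs(l) - 1] == 1) == (l > 0) for l in cl) for cl in clauses)

clauses = [(-1, -2, 3), (1, 2, 4), (-3, -4, 1)]
Bs = reduce_3sat(clauses, 4)
solutions = [c for c in product((0, 1), repeat=5) if is_iev_solution(Bs, c)]
print(len(solutions))
```

Every vector found this way has last coordinate 1, and dropping that coordinate gives a satisfying assignment, mirroring the two directions of the proof.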

From the above result, we know that it is difficult to
determine whether there exists an innovative vector for q = 2
with a general K. Apart from the problem of existence of
an innovative vector, it is also of interest to find an
N-dimensional encoding vector that is innovative to as many
users as possible. We state the optimization problem as follows:

Problem: Max-IEV$_q$

Instance: K matrices $C_k$ over GF(q), k = 1, 2, ..., K,
and each matrix has N columns.

Objective: Find an N-dimensional vector x over GF(q)
such that the number of users to whom x is innovative is
maximized.

The following result shows the hardness of finding an
approximate solution to Max-IEV$_q$:

Theorem 5. There is no approximation algorithm for
Max-IEV$_q$ with an approximation guarantee of $1 - \epsilon_M$, assuming
$P \neq NP$, where $\epsilon_M$ is a positive constant.

Proof: Given a Boolean expression in 3-CNF,
the problem of maximizing the number of clauses that have
TRUE values is commonly called the MAX-3-SAT problem.
Consider the same reduction described in the proof of
Theorem 4. It is clear that the number of clauses that have TRUE
values under a given Boolean vector x is the same as the
number of users to whom $\hat{x} = (x, 1)$ is innovative, excluding
user m + 1. Therefore, the reduction is a gap-preserving
reduction from MAX-3-SAT to Max-IEV$_2$. The statement
then follows from G2] Corollary 29.8]. ■ 

V. The Sparsity Problem 

Decoding complexity is one of the critical issues that 
could determine the practicality of linear network coding in 
broadcast erasure channels. One way to reduce the decoding 
complexity is to generate sparse encoding vectors and apply 
a decoding algorithm that exploits the sparsity of encoding 
vectors at receivers. In this section, we focus on the sparsity 
issues of innovative encoding vectors. 

A. Existence of K-sparse innovative vector 

In the previous section, it is found that innovative vectors
always exist if q ≥ K. In fact, we can prove a stronger
statement that K-sparse innovative vectors always exist under
the same condition.

Lemma 6. For k = 1, 2, ..., K, let $f_k(x)$ be a non-zero linear
polynomial in L variables,

$$f_k(x) = a_{k1} x_1 + a_{k2} x_2 + \cdots + a_{kL} x_L, \qquad k = 1, 2, \ldots, K,$$

where the coefficients are elements in GF(q). If q ≥ K, we
can always find a vector $x^* = (x_1^*, x_2^*, \ldots, x_L^*) \in GF(q)^L$
such that $f_k(x^*) \neq 0$ for all k.

We first give a combinatorial proof, and then provide a
deterministic algorithm which finds $x^*$.

Proof: For k = 1, 2, ..., K, let $V_k$ be the set of vectors
x in $GF(q)^L$ satisfying $f_k(x) = 0$. The set $V_k$ is a subspace
of dimension L − 1. By Lemma 1, the cardinality of the union
of these K subspaces is strictly less than $K q^{L-1}$,
which in turn is less than or equal to the cardinality of the
whole space $GF(q)^L$. Thus there exists at least one vector $x^*$
in $GF(q)^L$ such that $f_k(x^*) \neq 0$ for all k. ■

Now we give an algorithmic proof of Lemma 6. Let $S_l$,
where l = 1, ..., L, be the index set such that $k \in S_l$ if and
only if $a_{kl} \neq 0$. Since none of the linear polynomials $f_k(x)$
are zero, we have $\bigcup_{l=1}^{L} S_l = \{1, 2, \ldots, K\}$. We distinguish
two cases:

Case 1: $|S_l| = K$ for some l. We can simply let $x_l^* = 1$
and $x_n^* = 0$ for $n \neq l$.

Case 2: $|S_l| < K$ for all l. We assign values to the
variables iteratively. Suppose we have already assigned $x_t^*$
to the variable $x_t$, for t = 1, 2, ..., l − 1. We note that
$f_k(x_1^*, \ldots, x_{l-1}^*, x_l, 0, \ldots, 0) = 0$ is a linear equation in
a single variable $x_l$, and thus has only one solution. As
$|S_l| < K \leq q$, the number of elements in GF(q) which satisfy
$f_k(x_1^*, \ldots, x_{l-1}^*, x_l, 0, \ldots, 0) = 0$ for some $k \in S_l$ is strictly
less than q. We can choose $x_l^*$ such that

$$f_k(x_1^*, x_2^*, \ldots, x_{l-1}^*, x_l^*, 0, 0, \ldots, 0) \neq 0$$

for all $k \in S_l$. Upon termination, it is guaranteed that
$f_k(x_1^*, x_2^*, \ldots, x_L^*) \neq 0$ for all k. We call this method
the Sequential Assignment (SA) algorithm. Its computational
complexity in terms of the number of multiplications/divisions
over GF(q) is analyzed as follows. In this algorithm, there
are L iterations. In each iteration, we need to find an element
in GF(q) that is not a root of any of these K equations.
Consider the t-th iteration. For the k-th equation, we need
to compute $a_{k,t-1} x_{t-1}^*$ and add it to the accumulated sum
$\sum_{j=1}^{t-2} a_{kj} x_j^*$, which is stored for the next iteration. The root
of this equation can then be obtained by a division. Therefore,
the total complexity of SA is $O(KL)$.
Example: Let

$$f_1(x) = x_1 + 2x_2, \qquad f_2(x) = x_2 + 2x_3, \qquad f_3(x) = 2x_1 + x_3$$

be K = 3 linear polynomials over GF(3). We apply the algorithm
in Lemma 6 to find an assignment of $x = (x_1, x_2, x_3)$
such that $f_1(x)$, $f_2(x)$ and $f_3(x)$ are all non-zero. First of
all, the three index sets are $S_1 = \{1, 3\}$, $S_2 = \{1, 2\}$,
and $S_3 = \{2, 3\}$. None of them has cardinality three. We
proceed as described in the second case. We assign an arbitrary
non-zero value to $x_1$, say $x_1 = 1$, and we can check that

$$f_1(1,0,0) = 1, \qquad f_2(1,0,0) = 0, \qquad f_3(1,0,0) = 2.$$

Next, we want to find $x_2 \in GF(3)$ such that

$$f_1(1, x_2, 0) = 1 + 2x_2 \neq 0, \quad \text{and} \quad f_2(1, x_2, 0) = x_2 \neq 0.$$

It turns out that the only choice for $x_2$ is $x_2 = 2$. After $x_2$ is
fixed, we search for $x_3 \in GF(3)$ such that

$$f_2(1, 2, x_3) = 2 + 2x_3 \neq 0, \qquad f_3(1, 2, x_3) = 2 + x_3 \neq 0.$$

The only choice for $x_3$ is $x_3 = 0$. Finally, we check that the
values of $f_1$, $f_2$ and $f_3$ evaluated at x = (1, 2, 0),

$$f_1(1,2,0) = 2, \qquad f_2(1,2,0) = 2, \qquad f_3(1,2,0) = 2,$$

are all non-zero.
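
The Sequential Assignment procedure can be sketched as follows for a prime field GF(q). This is an illustrative implementation with a hypothetical function name; for simplicity it tests the q candidate values one by one instead of computing each root by a division, so its cost is O(KLq) rather than the O(KL) obtained in the analysis.

```python
def sequential_assignment(coeffs, q):
    """Find x in GF(q)^L with f_k(x) != 0 for every row k of `coeffs`.

    Assumes q is prime, q >= K, and that no row of `coeffs` is all zero.
    """
    K, L = len(coeffs), len(coeffs[0])
    # Case 1: some variable appears in every polynomial.
    for l in range(L):
        if all(coeffs[k][l] % q for k in range(K)):
            return [1 if j == l else 0 for j in range(L)]
    # Case 2: fix the variables one at a time.
    x = [0] * L
    partial = [0] * K                  # value of f_k on the prefix assigned so far
    for l in range(L):
        active = [k for k in range(K) if coeffs[k][l] % q]   # the index set S_l
        for cand in range(q):
            if all((partial[k] + coeffs[k][l] * cand) % q for k in active):
                x[l] = cand
                break
        else:
            raise ValueError("no valid value; check that q >= K")
        for k in range(K):
            partial[k] = (partial[k] + coeffs[k][l] * x[l]) % q
    return x

# f1 = x1 + 2 x2, f2 = x2 + 2 x3, f3 = 2 x1 + x3 over GF(3)
coeffs = [[1, 2, 0], [0, 1, 2], [2, 0, 1]]
x_star = sequential_assignment(coeffs, 3)
values = [sum(a * b for a, b in zip(row, x_star)) % 3 for row in coeffs]
print(x_star, values)
```

On the polynomials of the example, the sketch retraces the same choices and returns x = (1, 2, 0).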

By Lemma 6, we prove the following theorem:

Theorem 7. If q ≥ K, there exists a K-sparse encoding vector
in $\mathcal{I}$.

Proof: For k = 1, 2, ..., K, let $b_k$ be an arbitrary row
vector in $B_k$, and let $n_k$ be an arbitrary index such that the
$n_k$-th component of $b_k$ is non-zero. Form a new index set
$\mathcal{N}$ that consists of all the $n_k$'s. The cardinality of
$\mathcal{N}$ may be less than K since the $n_k$'s may not be distinct.
Let $b_k(\mathcal{N})$ be a truncated vector of $b_k$, which consists of
only the components of $b_k$ whose indices are in $\mathcal{N}$. Its
dimension is equal to $|\mathcal{N}| \leq K$.

Now we show that there exists a vector $x \in \mathcal{I}$ such that
the i-th component of x is equal to zero if $i \notin \mathcal{N}$. If the
i-th component of x is zero for all $i \notin \mathcal{N}$, then the inner
product of $b_k$ and x is the same as the inner product of
$b_k(\mathcal{N})$ and $x(\mathcal{N})$. According to Theorem 3, x is in
$\mathcal{I}$ if $b_k(\mathcal{N}) \cdot x(\mathcal{N}) \neq 0$ for all k. By
Lemma 6, we can find such a vector x if q ≥ K. Clearly, such a vector
has its support contained in $\mathcal{N}$, and is hence K-sparse. ■

The above result shows that the minimum Hamming weight
of innovative vectors is bounded above by K, assuming
q ≥ K. This upper bound cannot be further reduced, as
the following example shows. Consider a broadcast system
of K users and N packets, where N > K. Suppose that
user k has received a set of uncoded packets $A_k$. Here we
regard $A_k$ as a subset of {1, 2, ..., N}. Furthermore, suppose
that the complements of the $A_k$'s are mutually disjoint, i.e.,
$A_j^c \cap A_k^c = \emptyset$ for $j \neq k$. In such a scenario, an innovative
packet must be a linear combination of at least K packets.
For example, let N = 4 and K = 3. If the encoding matrices
of the three users are

$$C_1 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
C_2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
C_3 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix},$$

then an innovative encoding vector must have Hamming
weight at least 3. For instance, (1,1,1,0) and (1,1,0,1) are
innovative, but no vector with Hamming weight 2 or less is
innovative.

B. Sparsest Innovative Vectors 

Theorem 7 shows that we can always find a K-sparse
innovative vector if q ≥ K. It serves as an upper bound on the
minimum Hamming weight of innovative vectors. To further
reduce the decoding complexity, it is natural to consider the
issue of finding a sparsest innovative encoding vector for given
$C_k$'s. In other words, we want to find a vector in $\mathcal{I}$ that has
the minimum Hamming weight for the case where q ≥ K. We
call this algorithmic problem SPARSITY. We state its decision
version formally as follows:

Problem: SPARSITY

Instance: A positive integer n and K matrices with N
columns, $C_1, C_2, \ldots, C_K$, over GF(q), where q ≥ K.

Question: Is there a vector $x \in \mathcal{I}$ with Hamming weight
less than or equal to n?

Given all $C_k$'s, we can find bases $B_k$ of their
corresponding null spaces by the method mentioned in Section IV.
For k = 1, 2, ..., K, let $b_{k,i}$ be the i-th row of $B_k$. We define

$$b_k := \bigvee_{i=1}^{N - r_k} b_{k,i}, \tag{4}$$

where $\vee$ denotes the logical-OR operator applied component-wise
to the $N - r_k$ vectors, with each non-zero component
being regarded as a "1". In other words, the j-th component
of $b_k$ is one if and only if the j-th column of $B_k$ is non-zero.
We define B as the $K \times N$ matrix whose k-th row is equal to
$b_k$. Note that B is a binary matrix and has no zero rows. For
a matrix A and a subset $\mathcal{N}$ of the column indices of A, let
$A(\mathcal{N})$ be the $K \times |\mathcal{N}|$ submatrix of A, whose columns are
chosen according to $\mathcal{N}$. We need the following lemma:

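Lemma 8. Let $\mathcal{N} \subseteq \{1, 2, \ldots, N\}$ be an index set and $q \ge K$.
There exists an encoding vector $\mathbf{x} = (x_1, x_2, \ldots, x_N) \in \mathcal{I}$
over $GF(q)$ with $\mathrm{supp}(\mathbf{x}) \subseteq \mathcal{N}$ if and only if $\mathbf{B}(\mathcal{N})$ has no
zero rows.

Proof: If $\mathbf{B}(\mathcal{N})$ has no zero rows, then $\mathbf{b}_k(\mathcal{N}) \neq \mathbf{0}$ for
all $k$'s. Furthermore, for all $k$'s, there must exist $\mathbf{b}_{k,j}(\mathcal{N}) \neq \mathbf{0}$
for some $j$. By Lemma 6, we can find $\mathbf{x}(\mathcal{N}) \in GF(q)^{|\mathcal{N}|}$
such that $\mathbf{b}_{k,j}(\mathcal{N}) \cdot \mathbf{x}(\mathcal{N}) \neq 0$ for all $k$'s. Let the components
of $\mathbf{x}$ whose indices do not belong to $\mathcal{N}$ be zero. Then by
Theorem 3, $\mathbf{x} \in \mathcal{I}$.

Conversely, if $\mathbf{x}$ is an innovative vector with $x_n = 0$ for
$n \notin \mathcal{N}$, then $\mathbf{B}(\mathcal{N})$ cannot have zero rows, for if row $k$ of
$\mathbf{B}(\mathcal{N})$ is a zero vector, then $\mathbf{B}_k(\mathcal{N})$ is a zero matrix and the
$k$-th inequality in Theorem 3 cannot hold. ■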
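To make definition (4) and the condition of Lemma 8 concrete, the following Python sketch builds the binary matrix $\mathbf{B}$ from a list of null-space bases and tests whether $\mathbf{B}(\mathcal{N})$ has a zero row. The function names `or_vector` and `has_no_zero_rows` and the sample bases are hypothetical, chosen only for illustration; indices are 0-based, unlike the 1-based indices used in the text.

```python
def or_vector(Bk):
    """Row b_k of (4): entry j is 1 iff column j of Bk has a non-zero entry."""
    n_cols = len(Bk[0])
    return [int(any(row[j] != 0 for row in Bk)) for j in range(n_cols)]


def has_no_zero_rows(B, index_set):
    """Lemma 8 condition: every row of B restricted to index_set is non-zero."""
    return all(any(row[j] for j in index_set) for row in B)


# Hypothetical null-space bases over GF(3) for K = 3 users and N = 4 packets.
bases = [
    [[1, 2, 0, 1]],
    [[0, 2, 1, 0]],
    [[1, 0, 0, 2], [0, 0, 1, 0]],
]
B = [or_vector(Bk) for Bk in bases]
print(B)                             # the binary K x N matrix
print(has_no_zero_rows(B, {0, 1}))   # columns {1, 2} in 1-based numbering
print(has_no_zero_rows(B, {1}))      # column {2} alone
```

Under these assumptions, the first index set admits an innovative vector supported on it, while the second does not, because the third row of $\mathbf{B}$ vanishes on that column.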

The NP-completeness of Sparsity can be established by
reducing the hitting set problem, HittingSet, to Sparsity.
Recall that a problem instance of HittingSet consists of a
collection $\mathcal{C}$ of subsets of a finite set $\mathcal{U}$. A hitting set for $\mathcal{C}$ is
a subset of $\mathcal{U}$ that contains at least one element from
each subset in $\mathcal{C}$. The decision version of this problem is to
determine whether we can find a hitting set with cardinality
less than or equal to a given value.

Problem: HittingSet

Instance: A finite set $\mathcal{U}$, a collection $\mathcal{C}$ of subsets of $\mathcal{U}$ and
an integer $n$.

Question: Is there a subset $\mathcal{S} \subseteq \mathcal{U}$ with cardinality less than
or equal to $n$ such that for each $C \in \mathcal{C}$ we have $C \cap \mathcal{S} \neq \emptyset$?

It is well known that HittingSet is NP-complete.
Example: Let $\mathcal{U} = \{1, 2, 3, 4, 5\}$,

$\mathcal{C} = \{\{1,2,3\}, \{2,3,4\}, \{4,5\}\}$

and $n = 2$. We can check that the set $\{1, 4\}$ is a hitting set of
size $n = 2$.
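The check in this example can be reproduced mechanically. The following Python sketch is illustrative only; the helper name `is_hitting_set` is an assumption and does not come from the paper.

```python
def is_hitting_set(candidate, collection):
    """True iff candidate intersects every subset in the collection."""
    cand = set(candidate)
    return all(cand & set(subset) for subset in collection)


collection = [{1, 2, 3}, {2, 3, 4}, {4, 5}]
print(is_hitting_set({1, 4}, collection))   # the set from the example
print(is_hitting_set({1, 2}, collection))   # misses the subset {4, 5}
# No single element lies in all three subsets, so n = 1 is infeasible here.
print(any(is_hitting_set({u}, collection) for u in range(1, 6)))
```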

Theorem 9. Sparsity is NP-complete.

Proof: We are going to reduce HittingSet to an
instance of Sparsity via Karp-reduction [38]. Let the cardinality
of $\mathcal{U}$ be $N$. Label the elements of $\mathcal{U}$ by $1, 2, \ldots, N$.
We write $\mathcal{C} = \{C_1, C_2, \ldots, C_K\}$, where $K$ is the number
of non-empty subsets in $\mathcal{C}$. For $k = 1, 2, \ldots, K$,
form an $N$-vector $\mathbf{b}_k \in GF(q)^N$ with its $i$-th component
equal to one if $i$ is in $C_k$ and zero otherwise, i.e., $\mathbf{b}_k$
is the characteristic vector of $C_k$. Note that $\mathbf{b}_k \neq \mathbf{0}$ and

$\mathcal{C} = \{\mathrm{supp}(\mathbf{b}_1), \mathrm{supp}(\mathbf{b}_2), \ldots, \mathrm{supp}(\mathbf{b}_K)\}$.

These $\mathbf{b}_k$'s correspond to the degenerate form of the $\mathbf{B}_k$'s in
Theorem 3 with only one row in $\mathbf{B}_k$. Let $\mathbf{C}_k$ be the encoding matrix of
user $k$, whose row space is the null space of $\mathbf{B}_k$, and let $\mathcal{I}$ be
the corresponding set of innovative vectors. In other words, any
instance of HittingSet can be represented as an instance of
Sparsity in polynomial time.

It remains to show that there exists a hitting set $\mathcal{H}$ for $\mathcal{C}$ with
$|\mathcal{H}| \le n$ if and only if there exists an $\mathbf{x} \in \mathcal{I}$ with Hamming
weight $|\mathrm{supp}(\mathbf{x})| \le n$. Given the $\mathbf{b}_k$'s obtained via the above
reduction, suppose there exists $\mathbf{x} \in \mathcal{I}$ with $|\mathrm{supp}(\mathbf{x})| \le n$.
By Theorem 3, we must have $\mathbf{b}_k \cdot \mathbf{x} \neq 0$ for all $k$'s, which
implies $\mathrm{supp}(\mathbf{b}_k) \cap \mathrm{supp}(\mathbf{x}) \neq \emptyset$ for all $k$'s. The set $\mathrm{supp}(\mathbf{x})$ is
therefore a hitting set for the given instance. Conversely, given
a hitting set $\mathcal{H}$ for $\mathcal{C}$ with $|\mathcal{H}| \le n$, by definition
$\mathrm{supp}(\mathbf{b}_k) \cap \mathcal{H} \neq \emptyset$ for all $k$'s. Therefore, $\mathbf{B}(\mathcal{H})$ has no zero rows. By
Lemma 8, there exists an innovative vector $\mathbf{x} \in GF(q)^N$ such that
$\mathrm{supp}(\mathbf{x}) \subseteq \mathcal{H}$. Hence, $|\mathrm{supp}(\mathbf{x})| \le n$.

As Sparsity is verifiable in polynomial time, Sparsity
is in NP. Hence it is NP-complete. ■

Now we define the optimization version of Sparsity as
follows:

Problem: Min Sparsity

Instance: A positive integer $n$ and $K$ matrices with $N$
columns, $\mathbf{C}_1, \mathbf{C}_2, \ldots, \mathbf{C}_K$, over $GF(q)$, where $q \ge K$.

Objective: Find a vector $\mathbf{x} \in \mathcal{I}$ with minimum Hamming
weight.

The minimum Hamming weight among all innovative vectors
is called the sparsity number, and is denoted by $\omega$. It
is easy to see that if a polynomial-time algorithm can be
found for solving the optimization version of Sparsity, then
that algorithm can be used for solving the decision version
of Sparsity in polynomial time as well. Therefore, Min
Sparsity is NP-hard.

On the other hand, if $K$ is held fixed, then there exist
polynomial-time algorithms to solve Min Sparsity. It is
proven in [39] and Section IV-A that a $K$-sparse vector exists in
$\mathcal{I}$ if $q \ge K$. By listing all vectors in $GF(q)^N$ with Hamming
weight less than or equal to $K$, we can use Theorem 3 to check
whether each of them is in $\mathcal{I}$. For each $K$-sparse encoding
vector $\mathbf{x}$, we compute the matrix product $\mathbf{B}_k \mathbf{x}$ for $k = 1, 2, \ldots, K$.
Each matrix product takes $O(NK)$ finite field operations. The
total number of finite field operations for each candidate $\mathbf{x}$
is $O(NK^2)$. After checking all $K$-sparse encoding vectors,
we can then find one with minimum Hamming weight. The
number of non-zero vectors in $GF(q)^N$ with Hamming weight
no more than $K$ is equal to $\sum_{k=1}^{K} \binom{N}{k} (q-1)^k$. For fixed $K$ and
$q$, the summation is dominated by the largest term $\binom{N}{K} (q-1)^K$
when $N$ is large, which is of order $O(N^K)$. The brute-force
method can solve the problem with time complexity of
$O(N^K (NK^2))$. As $K$ is held fixed, Min Sparsity can be
solved in polynomial time in $N$.
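The brute-force search described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes $q$ is prime, so that arithmetic in $GF(q)$ is integer arithmetic modulo $q$, and the function names and the sample instance (one basis row per user) are hypothetical.

```python
from itertools import combinations, product


def is_innovative(x, bases, q):
    """x is innovative iff B_k x != 0 (mod q) for every user k."""
    return all(
        any(sum(b * xi for b, xi in zip(row, x)) % q for row in Bk)
        for Bk in bases
    )


def brute_force_min_sparsity(bases, N, q, max_weight):
    """Return a minimum-weight innovative vector with weight <= max_weight, or None."""
    for weight in range(1, max_weight + 1):          # try sparser vectors first
        for support in combinations(range(N), weight):
            for values in product(range(1, q), repeat=weight):
                x = [0] * N
                for j, v in zip(support, values):
                    x[j] = v
                if is_innovative(x, bases, q):
                    return x
    return None


# Hypothetical instance: q = 3, K = 3 users, N = 4 packets.
bases = [[[1, 2, 0, 1]], [[0, 2, 1, 0]], [[1, 0, 0, 2]]]
print(brute_force_min_sparsity(bases, N=4, q=3, max_weight=3))
```

Because candidates are enumerated in order of increasing Hamming weight, the first innovative vector found has minimum weight; the running time grows as $O(N^K)$ candidates, as in the analysis above.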

Let Min HittingSet be the minimization version of the
hitting set problem, in which we want to find a hitting set
with minimum cardinality. The next result shows that Min
Sparsity can be solved via Min HittingSet based on the
concept of Levin-reduction [38].

Theorem 10. Min Sparsity can be reduced to Min HittingSet
via Levin-reduction.

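Proof: Given an instance of Min Sparsity, we determine
$\mathbf{b}_k$ as in (4) for $k = 1, 2, \ldots, K$. Then we form the
following instance of Min HittingSet:

$\mathcal{U} = \{1, 2, \ldots, N\}$,

$\mathcal{C} = \{\mathrm{supp}(\mathbf{b}_1), \mathrm{supp}(\mathbf{b}_2), \ldots, \mathrm{supp}(\mathbf{b}_K)\}$.

Let $\mathcal{H}$ be a solution to the above instance, i.e., a minimum
hitting set. Then $\mathbf{B}(\mathcal{H})$ has no zero rows. By Lemma 8, there
exists a vector $\mathbf{x}^* \in \mathcal{I}$ over $GF(q)$ with $\mathrm{supp}(\mathbf{x}^*) \subseteq \mathcal{H}$.
Such a vector $\mathbf{x}^*$ can be found by the SA algorithm in polynomial time.

We claim that there does not exist $\mathbf{x}' \in \mathcal{I}$ with Hamming
weight $|\mathrm{supp}(\mathbf{x}')| < |\mathcal{H}|$, and thus $|\mathrm{supp}(\mathbf{x}^*)|$ must equal
$|\mathcal{H}|$. Suppose there exists such a vector $\mathbf{x}'$. Lemma 8 implies
that $\mathbf{B}(\mathrm{supp}(\mathbf{x}'))$ has no zero rows, which in turn implies
that $\mathrm{supp}(\mathbf{x}') \cap \mathrm{supp}(\mathbf{b}_k) \neq \emptyset$ for all $k$'s. Then $\mathrm{supp}(\mathbf{x}')$
would be a hitting set with cardinality strictly less than $|\mathcal{H}|$,
a contradiction. ■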

VI. Network Coding Algorithms 

In this section, we present algorithms that generate sparse
innovative encoding vectors for $q \ge K$. For the binary
case ($q = 2$), an innovative encoding vector may not always
exist; a modification of the algorithm is also proposed for
handling such cases.

A. The Optimal Hitting Method

For $q \ge K$, we generate a sparsest innovative vector in
two steps. First we find an index set $\mathcal{N}$ with minimum
cardinality, which determines the support of the innovative
encoding vector. This is accomplished by solving the hitting
set problem. Once $\mathcal{N}$ is found, the non-zero entries in the
vector can be obtained by the SA algorithm.

The hitting set problem can be solved exactly by binary
integer programming (BIP), formulated as follows:

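$$\omega = \min_{\mathbf{y}} \; y_1 + y_2 + \cdots + y_N,$$

subject to

$$\mathbf{B}\mathbf{y} \ge \mathbf{1},$$

where

$$\mathbf{B} = \begin{bmatrix} \mathbf{b}_1 \\ \mathbf{b}_2 \\ \vdots \\ \mathbf{b}_K \end{bmatrix}$$

is a $K \times N$ binary matrix, $\mathbf{y} = (y_1, y_2, \ldots, y_N)$ is an
$N$-dimensional binary vector, $\mathbf{1}$ is the $K$-dimensional all-one
vector, and the inequality sign is applied component-wise.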

To solve the above problem, we can apply any algorithm for 
solving BIP in general, for example the cutting plane method. 
We refer the readers to [40] for more details on BIP. 

Example: Let $q = 3$, $K = 3$ and $N = 4$, and the orthogonal
complements of $V_1$, $V_2$ and $V_3$ be given respectively by the
row spaces of $\mathbf{B}_1$, $\mathbf{B}_2 = [0 \; 2 \; 1 \; 0]$ and $\mathbf{B}_3$.

The vectors $\mathbf{b}_k$, for $k = 1, 2, 3$, are

$\mathbf{b}_1 = [1 \; 1 \; 0 \; 1]$, $\mathbf{b}_2 = [0 \; 1 \; 1 \; 0]$, $\mathbf{b}_3 = [1 \; 0 \; 1 \; 1]$.

The corresponding instance of Min HittingSet is:

$\mathcal{U} = \{1, 2, 3, 4\}$, $\mathcal{C} = \{\{1,2,4\}, \{2,3\}, \{1,3,4\}\}$.

The solution to both Min Sparsity and Min HittingSet
can be obtained by solving the following BIP:

$$\min \; y_1 + y_2 + y_3 + y_4,$$

subject to

$$y_1 + y_2 + y_4 \ge 1, \quad y_2 + y_3 \ge 1, \quad y_1 + y_3 + y_4 \ge 1,$$
$$y_1, y_2, y_3, y_4 \in \{0, 1\}.$$

One optimal solution is $y_1 = y_2 = 1$ and $y_3 = y_4 = 0$.
That means the sparsity number, $\omega$, is equal to two and
$\mathcal{H} = \{1, 2\}$. Furthermore, according to Lemma 8, a 2-sparse
innovative encoding vector can be found, for example, by the
SA algorithm.
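For an instance as small as this one, the BIP can be cross-checked by exhaustive search over subsets in order of increasing size. The sketch below is a stand-in for a BIP solver, not the cutting plane method mentioned in the text; the function name `min_hitting_set` is hypothetical.

```python
from itertools import combinations


def min_hitting_set(universe, collection):
    """Smallest subset of universe that intersects every set in collection.

    Exhaustive search, exponential in |universe|; returns None if some set
    in the collection cannot be hit (e.g. it is empty).
    """
    elements = sorted(universe)
    for size in range(len(elements) + 1):
        for cand in combinations(elements, size):
            chosen = set(cand)
            if all(chosen & s for s in collection):
                return chosen
    return None


universe = {1, 2, 3, 4}
collection = [{1, 2, 4}, {2, 3}, {1, 3, 4}]
H = min_hitting_set(universe, collection)
y = [int(j in H) for j in sorted(universe)]   # the binary vector y of the BIP
print(H, y)
```

The size of the returned set is the sparsity number of the instance; ties between minimum hitting sets are broken here by lexicographic order.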

We call the above procedure for generating an innovative
vector with minimum Hamming weight the Optimal Hitting
(OH) method. We summarize the algorithm as follows:

The Optimal Hitting method (OH):

Input: For $k = 1, 2, \ldots, K$, a full-rank $r_k \times N$ matrix $\mathbf{C}_k$ over
$GF(q)$, where $q \ge K$ and $0 \le r_k < N$.

Output: $\mathbf{x} = (x_1, x_2, \ldots, x_N) \in \mathcal{I}$ with minimum Hamming
weight.

Step 0: Initialize $\mathbf{x}$ as the zero vector.

Step 1: For $k = 1, 2, \ldots, K$, obtain a basis of the null space
of $\mathbf{C}_k$. Let $\mathbf{B}_k$ be the $(N - r_k) \times N$ matrix over $GF(q)$ whose
$j$-th row is the $j$-th vector in the basis.

Step 2: For $k = 1, 2, \ldots, K$, let $\mathbf{b}_k$ be the component-wise
logical-OR of the $N - r_k$ row vectors of $\mathbf{B}_k$. (Each
non-zero component of $\mathbf{B}_k$ is regarded as a "1" when
taking the logical-OR operation.)

Step 3: Solve the corresponding Min HittingSet instance as shown
in Theorem 10 and return $\mathcal{H}$.

Step 4: For $k = 1, 2, \ldots, K$, choose a row vector from $\mathbf{B}_k$,
say $\mathbf{b}_k$, such that $\mathrm{supp}(\mathbf{b}_k) \cap \mathcal{H} \neq \emptyset$.

Step 5: Determine $\mathbf{x}(\mathcal{H})$ such that $\mathbf{x}(\mathcal{H}) \cdot \mathbf{b}_k(\mathcal{H}) \neq 0$ for
$k = 1, 2, \ldots, K$, by the SA algorithm.

Example continued: We solve the hitting set problem in
Step 3 and obtain $\mathbf{y} = (1, 1, 0, 0)$. Hence, $\mathcal{H} = \{1, 2\}$. In Step
4, we choose

$\mathbf{b}_1 = (1, 2, 0, 1)$, $\mathbf{b}_2 = (0, 2, 1, 0)$ and $\mathbf{b}_3 = (1, 0, 0, 2)$.

In Step 5, we obtain $\mathcal{S}_1 = \{1, 3\}$ and $\mathcal{S}_2 = \{1, 2\}$. Note that
neither $|\mathcal{S}_1|$ nor $|\mathcal{S}_2|$ is equal to 3. We next set $x_1 = 1$, and
choose $x_2$ such that

$\mathbf{b}_1 \cdot (1, x_2, 0, 0) \neq 0$,

$\mathbf{b}_2 \cdot (1, x_2, 0, 0) \neq 0$.

We can choose $x_2 = 2$ to satisfy these two inequalities
simultaneously. The vector $\mathbf{x} = (1, 2, 0, 0)$ is an innovative
encoding vector of minimum Hamming weight.



B. The Greedy Hitting Method 

Step 3 in the OH method requires solving an NP-hard
problem. Therefore, computationally efficient heuristics
should be considered in practice. It is well known that Min
HittingSet can be solved efficiently and approximately by
the following greedy approach [41]:

• Repeat until all sets of $\mathcal{C}$ are hit:
  - Pick the element that hits the largest number of sets
    that have not been hit yet.

In Step 3 of the OH method, the above greedy algorithm can be
used to find approximate solutions. We call this modification
the Greedy Hitting (GH) method.
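The greedy rule can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the function name `greedy_hitting_set` is hypothetical, and ties are broken in favour of the smallest element label, which is an assumption the text does not specify.

```python
def greedy_hitting_set(collection):
    """Greedy heuristic: repeatedly pick the element hitting the most unhit sets."""
    remaining = [set(s) for s in collection if s]   # sets not hit yet
    chosen = []
    while remaining:
        counts = {}
        for s in remaining:
            for e in s:
                counts[e] = counts.get(e, 0) + 1
        # Largest count first; ties go to the smallest label (assumption).
        best = min(counts, key=lambda e: (-counts[e], e))
        chosen.append(best)
        remaining = [s for s in remaining if best not in s]
    return sorted(chosen)


print(greedy_hitting_set([{1, 2, 4}, {2, 3}, {1, 3, 4}]))
print(greedy_hitting_set([{1, 2, 3}, {2, 3, 4}, {4, 5}]))
```

On these two small instances the greedy choice happens to be optimal; in general it only guarantees the harmonic-number approximation factor.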

Theorem 11. The GH method is an $H_n$-factor approximation
algorithm for Min Sparsity, where $H_n$ is the $n$-th harmonic
number, defined as $H_n = \sum_{k=1}^{n} \frac{1}{k}$.

Proof: It is well known that the hitting set problem is
just a reformulation of the set covering problem. Therefore,
the greedy algorithm is an $H_n$-factor approximation algorithm
for Min HittingSet, as well as for the set covering problem
[37]. As shown in Theorem 10, Min Sparsity can be reduced
to Min HittingSet, and the sparsity number is equal to the
cardinality of the minimum hitting set. Hence, GH is also an
$H_n$-factor approximation algorithm for Min Sparsity. ■

Now we analyze the computational complexity of the GH
method. The computation of each $\mathbf{B}_k$ can be reduced to
the computation of the RREF of $\mathbf{C}_k$, which takes $O(N^3)$
arithmetic operations. However, if the encoding vectors are
$\omega$-sparse, we can adopt the dual-basis approach in obtaining
$\mathbf{B}_k$ as in the appendix, and guarantee that each $\mathbf{B}_k$ can be
obtained in $O(\omega N^2)$ time. The computational complexity
of Step 1 is thus $O(\omega K N^2)$. Step 2 involves $O(KN^2)$
operations. If Step 3 is solved by the greedy algorithm, then it
takes $O(KN^2)$ operations. Step 4 requires $O(K)$ operations.
Step 5 involves the SA algorithm, which has a complexity of
$O(K|\mathcal{H}|)$. Since $|\mathcal{H}| \le N$, the overall complexity of GH is
$O(\omega K N^2)$.

C. Solving Binary Equation Set for q = 2

The last step of the GH method involves solving a set of
linear inequalities over $GF(q)$. However, when $q = 2$, solving
a linear inequality of the form $f(\mathbf{x}) \neq 0$ is equivalent to
solving the linear equation $f(\mathbf{x}) = 1$. Based on this fact,
we now propose a procedure called Solving Binary
Equation Set (SBES), which modifies the SA algorithm so
that it is applicable to the case where $q = 2$. Note that the
same idea can be applied to cases where $q$ is a prime power
satisfying $2 < q < K$.

The heuristic is as follows. We want to find $\mathbf{x}$ such that
$\mathbf{A}\mathbf{x} = \mathbf{1}$, where $\mathbf{A}$ is the coefficient matrix of the system of
linear equations. The system may be inconsistent and have no
solution. Nevertheless, we can disregard some equations and
guarantee that at least $\mathrm{rank}(\mathbf{A})$ equations are satisfied.

Solving Binary Equation Set Procedure (SBES):

Input: A $K \times N$ matrix $\mathbf{B}$ over $GF(2)$ and $\mathcal{N}$, where
$\mathcal{N} \subseteq \{1, 2, \ldots, N\}$.

Output: $\mathbf{x} = (x_1, x_2, \ldots, x_N) \in GF(2)^N$ with support in $\mathcal{N}$.

Step 0: Assign $\mathbf{z} = (z_1, z_2, \ldots, z_{|\mathcal{N}|})$ as a zero vector.

Step 1: Delete columns of $\mathbf{B}$ whose column indices are not in
$\mathcal{N}$. Augment the resulting matrix by adding a $K$-dimensional
all-one column vector to the right-hand side. Let the resulting
matrix be denoted by $\mathbf{Q}$.

Step 2: Compute the row echelon form (REF) of $\mathbf{Q}$ and call
it $\mathbf{Q}'$.

Step 3: Delete all zero rows in $\mathbf{Q}'$ and any row in $\mathbf{Q}'$ that has
a single "1" in the $(|\mathcal{N}| + 1)$-th entry. The resulting matrix is
called $\mathbf{Q}''$. Let the number of pivots in $\mathbf{Q}''$ be $\nu$, and let
$p_1, p_2, \ldots, p_\nu$ be the column indices of the pivots in $\mathbf{Q}''$,
listed in ascending order.

Step 4: Execute elementary row operations on $\mathbf{Q}''$ so that $\mathbf{Q}''$
is transformed into its row-reduced echelon form.

Step 5: Set the variables associated with the non-pivot
columns to zero. For $i = 1, 2, \ldots, \nu$, assign $z_{p_i}$ the value
of the $i$-th entry of the last column in $\mathbf{Q}''$.

Step 6: Assign values to the components of $\mathbf{x}$ such that
$\mathbf{x}(\mathcal{N}) = \mathbf{z}$, and $x_i = 0$ if $i \notin \mathcal{N}$.
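The SBES steps can be sketched compactly in Python. The sketch below is an illustration under stated assumptions rather than the authors' implementation: it merges Steps 2 to 4 into a single Gauss-Jordan pass over the coefficient columns, so that rows whose coefficient part becomes zero (redundant or inconsistent equations) are simply left out; the names `sbes` and `satisfied`, the 0-based indexing and the sample matrix are all hypothetical.

```python
def sbes(B, support):
    """Heuristic solution of B x = 1 over GF(2) with supp(x) inside `support`."""
    N = len(B[0])
    cols = sorted(support)
    m = len(cols)
    # Step 1: keep the selected columns and append the all-one column.
    Q = [[row[j] & 1 for j in cols] + [1] for row in B]
    pivots, r = [], 0
    # Steps 2-4 merged: Gauss-Jordan elimination on the coefficient columns.
    for c in range(m):
        p = next((i for i in range(r, len(Q)) if Q[i][c]), None)
        if p is None:
            continue                      # no pivot in this column
        Q[r], Q[p] = Q[p], Q[r]
        for i in range(len(Q)):
            if i != r and Q[i][c]:
                Q[i] = [a ^ b for a, b in zip(Q[i], Q[r])]
        pivots.append(c)
        r += 1
    # Rows r, r+1, ... now have an all-zero coefficient part and are dropped.
    # Step 5: free variables stay 0; pivot variables take the last column.
    z = [0] * m
    for i, c in enumerate(pivots):
        z[c] = Q[i][m]
    # Step 6: embed z into a length-N vector supported on `support`.
    x = [0] * N
    for j, v in zip(cols, z):
        x[j] = v
    return x


def satisfied(B, x):
    """Entry k is 1 iff row k of B has inner product 1 with x over GF(2)."""
    return [sum(b * xi for b, xi in zip(row, x)) % 2 for row in B]


# Hypothetical instance with K = 4 users and N = 5 packets.
B = [
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 1, 0, 1],
]
x_restricted = sbes(B, {0, 2})       # support limited to two columns
x_full = sbes(B, range(5))           # full index set
print(x_restricted, satisfied(B, x_restricted))
print(x_full, satisfied(B, x_full))
```

In this hypothetical instance, restricting the support to two columns leaves one equation unsatisfied, whereas using the full index set satisfies all four, at the cost of a larger Hamming weight.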

When applying the GH method to the case where $q = 2$, we
replace the SA algorithm in Step 5 of the GH method by the
SBES procedure. We call this modification GH with SBES.

Example: Consider $q = 2$, $K = 4$, $N = 5$, $\mathcal{H} = \{1, 3\}$ and a
given $4 \times 5$ binary matrix $\mathbf{B}$. In Step 1, we extract the first
and third columns of $\mathbf{B}$ and augment them by the
all-one column vector, $\mathbf{1}$, to obtain $\mathbf{Q}$. In Step 2, we compute
the REF of $\mathbf{Q}$. The last row is an all-zero row, representing a redundant
equation. The second last row contains a one in the third
component, and zero elsewhere, implying that the original
system of linear equations cannot be solved. In order to get a
heuristic solution, we relax the system by deleting that row,
and obtain

$$\mathbf{Q}'' = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.$$

We have $p_1 = 1$ and $p_2 = 2$ in Step 3. The matrix $\mathbf{Q}''$ is
already in its RREF. By Step 5, $z_1 = q''(1, 3) = 1$ and $z_2 =
q''(2, 3) = 0$. Finally, we set $x_1 = z_1 = 1$, and $x_3 = z_2 = 0$.
So $\mathbf{x} = (1, 0, 0, 0, 0)$ and $\mathbf{x}$ is innovative to all except the last
user.

In this example, an innovative vector does exist, but GH
with SBES fails to find it. The main reason is that the hitting
set subproblem is formulated for the case where $q \ge K$. When
$q$ is small, there is no guarantee that a non-empty $\mathcal{I}$
contains a vector with support restricted to $\mathcal{H}$. Indeed, if we
skip the greedy hitting procedure and simply let $\mathcal{H}$ be the
index set of all packets, i.e., $\{1, 2, \ldots, N\}$, then it is easy to
check that with the same input, the SBES procedure returns
$\mathbf{x} = (0, 0, 1, 1, 0)$, which is innovative to all users.
We call this modification Full Hitting (FH) with SBES, for
the reason that the hitting set is chosen as the full index set
of the packets. In general, FH with SBES produces encoding
vectors that are innovative to more users than GH with SBES,
at the expense of higher Hamming weights.

The SBES procedure is essentially Gauss-Jordan elimination,
so its total complexity is $O(NK^2)$. Note that the
encoding vectors generated by GH with the SBES procedure
may not be innovative when $K > q = 2$; however, they are
still innovative to a fraction of the $K$ users.

VII. Performance Evaluation 

In this section, we evaluate our proposed methods via simulations.
We simulate a broadcast system in which a transmitter
broadcasts $N$ equal-size packets to $K$ users via an erasure
broadcast channel. The whole process is divided into two
phases. The transmitter sends all packets one by one without
network coding in the first phase. In the second phase, packets
are encoded by the encoding vectors generated by the methods
under consideration and transmitted until all users have received
enough packets for recovering the $N$ packets. The transmitter encodes
packets based on the perfect feedback from the users. A user
acknowledges a packet if the packet is received successfully
and is innovative to that user. The transmitter sets up $K$ matrices
$\mathbf{C}_k$, for $k = 1, 2, \ldots, K$, where $\mathbf{C}_k$ records the encoding
vectors of the encoded packets that have been acknowledged
by user $k$ after the previous transmissions. Since the packets
are uncoded in the first phase, each row of $\mathbf{C}_k$ contains
exactly one nonzero component before the retransmission. In
our simulations, each simulation point involves 3000 random
realizations and we assume that $N = 32$ and $P_e = 0.3$.

Fig. 2. The average delay for q = 2 and N = 32.

First, we examine the completion time and average delay
of a two-phase broadcast system with different encoding
schemes. The completion time is defined, from a system
perspective, as the average of the total number of transmissions
for the transmitter to complete an $N$-packet transmission for
all users. The average delay, from a user perspective, is defined
as the individual completion time averaged over all users.
We only consider the case where $q = 2$, since our proposed
methods always generate innovative vectors for $q \ge K$ and
are rate-optimal.

Figure 1 shows the completion time performance of the
broadcast system with GH with SBES, FH with SBES, the
maximum weight vertex search (MWVS) algorithm in [16],
which generates instantly decodable packets, and
binary random linear network code (RLNC). In the figure,
we include the minimum completion time performance of a
broadcast system in an erasure channel as a reference [5]. We
remark that the completion time curves of GH and FH with
a large finite field size are exactly the same as the optimal
curve shown in the figure. For $q = 2$, simulation results show
that GH and FH (with SBES) always outperform RLNC and
MWVS. More importantly, it is found that the completion time
performance of FH with SBES is nearly optimal.

Next, we consider the average delay performance. According
to information theory, the best we can do in a (single-user)
erasure channel with erasure probability $P_e$ is to have
$N/(1 - P_e) \approx 45.7$ transmissions on average. As mentioned
before, GH and FH can achieve that limit when $q \ge K$,
since they always generate innovative packets. Here we only
consider the case where $q = 2$. In Figure 2, the horizontal
line at 45.7 serves as a lower bound, which is not tight since
innovative vectors do not always exist. Again, both GH and
FH (with SBES) outperform RLNC and MWVS.

Intuitively, the delay performance of an encoding scheme is
related to its ability to generate innovative vectors. Thus it is
meaningful to examine how often the users receive a
non-innovative packet generated by an encoding scheme before
they receive a complete set of packets, so that we have a
better understanding of its delay performance. We examine
the average number of non-innovative packets received by a
user for different schemes when $q = 2$ in Figure 3. We observe
that due to the random nature of RLNC, the average number
of received non-innovative packets per user does not depend
on the number of users in the system. This phenomenon
agrees with the result in Figure 2, where the average delay
performance of RLNC is independent of the number of users
as well. In Figure 3, it is found that an encoding scheme with
SBES always generates fewer non-innovative packets. This can be
explained by the fact that SBES attempts to find a binary
vector that is innovative to as many users as possible. Figure 3
also shows that FH with SBES generates the fewest
non-innovative packets among all encoding schemes that we
considered. Note that the hitting set procedure in GH is the
source of sparsity of the encoding vectors. However, it may
limit the choices for finding an innovative vector or solutions
to inequality sets in SBES. Therefore the performance of FH
is better.

Apart from the delay performance, we are also interested in
the decoding complexity of different schemes. As the sparsity
of encoding vectors may affect the decoding complexity, it is
of interest to know how sparse the encoding vectors generated
by different schemes could be. For RLNC with $q = 2$, since
we generate each component of an encoding vector with equal
probability of zero and one, the average Hamming weight
is $N/2 = 16$, which is not shown in Figure 4. For other
encoding schemes, it can be observed from Figure 4 that GH
and MWVS generate sparse encoding vectors whose average
Hamming weight is less than or around 6. The results of GH
can be attributed to the hitting set minimization. FH with
SBES tries to find vectors that are innovative to as many users
as possible, without explicit consideration of the Hamming
weight. Therefore, it can achieve a better delay performance
at the expense of less sparsity.

Next we evaluate the decoding complexity in terms of the
number of additions and multiplications performed. In our
simulations, an addition operation is counted only when both
operands are non-zero. A multiplication operation is counted
when neither of the two operands is 1 or 0. The decoding
algorithms in most of the previous work are basically
Gauss-Jordan elimination, except the instantly decodable schemes in
[16], [19]. What we are proposing for decoding is also
Gauss-Jordan elimination, but implemented for sparse matrices.
Figures 5 and 6 show the average total number of operations for
all users in the system with different schemes when $q = 101$ and
$q = 2$, respectively. In Figures 4, 5 and 6, we observe that,
in general, encoding vectors with larger average Hamming
weight result in higher decoding complexity. Figure 5 shows
that when $q = 101$, GH yields sparse innovative encoding
vectors that result in a significant reduction in both the average
total number of additions and multiplications when compared
with RLNC. Although both GH and RLNC can offer
delay-optimal performance, GH involves fewer decoding operations,
and hence is a preferable choice for the optimal delay performance.
In Figure 6, we examine the number of additions involved
for different schemes when $q = 2$. We observe that, with
instantly decodable MWVS, a receiver enjoys a low decoding
complexity at the expense of larger delay. Furthermore, it is
found that GH requires significantly fewer computations than
RLNC. In particular, the decoding computations involved in GH
with SBES are only 40 percent of the decoding computations
of RLNC, even though GH with SBES outperforms RLNC in delay
performance. As a result, GH is a good choice in terms of
both delay performance and decoding complexity for $q = 2$.
However, if delay performance is the major concern, FH
with SBES, providing nearly optimal delay performance with
moderate decoding complexity, should be a good option. In
short, both GH and FH offer promising choices for the
trade-off between delay performance and decoding complexity in an
erasure broadcast system.

VIII. Conclusions 

In this paper, we adopt the computational approach to study
the linear network code design problem for wireless broadcast
systems. To minimize the completion time or to maximize
the information rate, the concept of innovativeness plays an
important role. While it is well known that innovative encoding
vectors always exist when the finite field size, q, is greater
than the number of users, K, we prove that the problem
of determining their existence over the binary field is
NP-complete. Its corresponding maximization version is not only
hard to solve, but also hard to approximate. Nevertheless, we
propose a heuristic called FH with SBES, which is numerically
shown to be nearly optimal under our simulation settings.

Sparsity of a network code is another issue we have addressed.
When $q \ge K$, we show that the minimum Hamming
weight within the set of innovative vectors is bounded above
by $K$. To find a vector that achieves the minimum weight,
however, is proven to be NP-hard via the reduction from
the hitting set problem. An exact algorithm based on BIP
is described, and a polynomial-time approximation algorithm
based on the greedy approach is constructed.

The performance of our proposed algorithms has been
evaluated by simulations. When $q \ge K$, our proposed algorithm
is rate-optimal and is effective in reducing decoding
complexity. When $q = 2$, our proposed algorithms are able to
strike a proper balance between completion time and decoding
complexity.

Acknowledgements: The authors would like to thank Prof. 
Wai Ho Mow and Dr. Kin-Kwong Leung for their stimulating 
discussions. 

Appendix 

In this appendix, we illustrate how to compute a basis of the
null space incrementally. In the application to the broadcast
system we consider in this paper, the rows of $\mathbf{C}$ are given
one by one. A row is revealed after an innovative packet is
received. Given an $r \times N$ matrix $\mathbf{C}$ over $GF(q)$, recall that
our objective is to find a basis for the null space of $\mathbf{C}$. The
idea is as follows. We first extend $\mathbf{C}$ to a non-singular $N \times N$
matrix by appending $N - r$ row vectors. Let the resulting
square matrix be $\tilde{\mathbf{C}}$. Let $\mathbf{B}$ be the inverse of $\tilde{\mathbf{C}}$. By the very
definition of matrix inverse, the last $N - r$ columns of $\mathbf{B}$ form a
basis for the null space of $\mathbf{C}$.

We proceed by induction. The algorithm is initialized by
setting $\tilde{\mathbf{C}} = \mathbf{B} = \mathbf{I}_N$. We will maintain the property that
$\tilde{\mathbf{C}}^{-1} = \mathbf{B}$.

Suppose that the first $r$ rows of $\tilde{\mathbf{C}}$ are the encoding vectors
received by a user, and $\tilde{\mathbf{C}} = \mathbf{B}^{-1}$. We let $\mathbf{c}_i^T$ be the $i$-th row of
$\tilde{\mathbf{C}}$ and $\mathbf{b}_j$ be the $j$-th column of $\mathbf{B}$. When a packet arrives, we
can check whether it is innovative by taking the inner product
of the encoding vector of the new packet, say $\mathbf{w}$, with $\mathbf{b}_{r+1}$,
$\mathbf{b}_{r+2}, \ldots, \mathbf{b}_N$. According to Theorem 3, it is not innovative
if and only if all such inner products are zero.

Consider the case that $\mathbf{w}$ is innovative. Permute the columns
of $\mathbf{B}$, if necessary, to ensure that $\mathbf{w}^T \mathbf{b}_{r+1} \neq 0$. This can
always be done, since $\mathbf{w}$ cannot be orthogonal to all the last
$N - r$ columns of $\mathbf{B}$. Permute the rows of $\tilde{\mathbf{C}}$ accordingly, so
as to ensure that $\tilde{\mathbf{C}}^{-1} = \mathbf{B}$. Then, we modify $\tilde{\mathbf{C}}$ by updating
its $(r+1)$-st row to $\mathbf{w}^T$. This operation can be expressed
algebraically by

$$\tilde{\mathbf{C}} \leftarrow \tilde{\mathbf{C}} + \mathbf{e}_{r+1} (\mathbf{w} - \mathbf{c}_{r+1})^T,$$



where $\mathbf{e}_{r+1}$ is the column vector with the $(r+1)$-st component
equal to 1 and zero otherwise. The matrix $\mathbf{e}_{r+1} (\mathbf{w} - \mathbf{c}_{r+1})^T$ is
a rank-one matrix, with the $(r+1)$-st row equal to $(\mathbf{w} - \mathbf{c}_{r+1})^T$,
and zero everywhere else. The inverse of $\tilde{\mathbf{C}} + \mathbf{e}_{r+1} (\mathbf{w} - \mathbf{c}_{r+1})^T$
can be computed efficiently by the Sherman-Morrison
formula [42], [43, p. 18],

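$$
\begin{aligned}
\left(\tilde{\mathbf{C}} + \mathbf{e}_{r+1} (\mathbf{w} - \mathbf{c}_{r+1})^T\right)^{-1}
&= \tilde{\mathbf{C}}^{-1} - \frac{\tilde{\mathbf{C}}^{-1} \mathbf{e}_{r+1} (\mathbf{w} - \mathbf{c}_{r+1})^T \tilde{\mathbf{C}}^{-1}}{1 + (\mathbf{w} - \mathbf{c}_{r+1})^T \tilde{\mathbf{C}}^{-1} \mathbf{e}_{r+1}} \\
&= \tilde{\mathbf{C}}^{-1} - \frac{\mathbf{b}_{r+1} (\mathbf{w} - \mathbf{c}_{r+1})^T \tilde{\mathbf{C}}^{-1}}{\mathbf{w}^T \mathbf{b}_{r+1}} \\
&= \tilde{\mathbf{C}}^{-1} - \frac{\mathbf{b}_{r+1} \left(\mathbf{w}^T \tilde{\mathbf{C}}^{-1} - \mathbf{e}_{r+1}^T\right)}{\mathbf{w}^T \mathbf{b}_{r+1}}. \qquad (6)
\end{aligned}
$$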

We have used the facts that $\tilde{\mathbf{C}}^{-1} \mathbf{e}_{r+1} = \mathbf{b}_{r+1}$ and
$\mathbf{c}_{r+1}^T \tilde{\mathbf{C}}^{-1} = \mathbf{e}_{r+1}^T$ in the above equations. The denominator
of the fraction in (6) is a non-zero scalar by construction, so that
division by zero does not occur. We update $\mathbf{B}$ by the expression in (6).
Note that if $\mathbf{w}$ is $\omega$-sparse, the multiplication of $\mathbf{w}^T$ and $\tilde{\mathbf{C}}^{-1}$
in (6) can be done in $O(\omega N)$ time.
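The incremental update can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes a prime field size `p` (so that field arithmetic is integer arithmetic modulo `p`), uses 0-based indices, does not exploit the sparsity of the encoding vector, and the class name `NullSpaceTracker` and its methods are hypothetical. The modular inverse relies on `pow(a, -1, p)`, available in Python 3.8 and later.

```python
class NullSpaceTracker:
    """Maintains C (N x N) and its inverse B over GF(p); the last N - r
    columns of B form a basis of the null space of the r received rows."""

    def __init__(self, N, p):
        self.N, self.p, self.r = N, p, 0
        self.C = [[int(i == j) for j in range(N)] for i in range(N)]
        self.B = [[int(i == j) for j in range(N)] for i in range(N)]

    def add(self, w):
        """Process encoding vector w; return True iff it is innovative."""
        N, p, r, B, C = self.N, self.p, self.r, self.B, self.C
        # w^T B: inner products of w with every column of B.
        wB = [sum(w[i] * B[i][j] for i in range(N)) % p for j in range(N)]
        j = next((j for j in range(r, N) if wB[j]), None)
        if j is None:
            return False                  # orthogonal to the null-space basis
        if j != r:                        # permute so that w^T b_{r+1} != 0
            for row in B:
                row[r], row[j] = row[j], row[r]
            C[r], C[j] = C[j], C[r]
            wB[r], wB[j] = wB[j], wB[r]
        inv = pow(wB[r], -1, p)           # 1 / (w^T b_{r+1}) in GF(p)
        b = [B[i][r] for i in range(N)]   # column b_{r+1}
        v = wB[:]
        v[r] = (v[r] - 1) % p             # w^T B - e_{r+1}^T
        for i in range(N):                # rank-one update of B as in (6)
            f = b[i] * inv % p
            if f:
                B[i] = [(B[i][t] - f * v[t]) % p for t in range(N)]
        C[r] = [x % p for x in w]         # (r+1)-st row of C becomes w^T
        self.r += 1
        return True

    def null_basis(self):
        """Last N - r columns of B, returned as a list of vectors."""
        return [[self.B[i][j] for i in range(self.N)]
                for j in range(self.r, self.N)]


tracker = NullSpaceTracker(N=4, p=3)
print(tracker.add([1, 2, 0, 0]))   # first packet
print(tracker.add([2, 1, 0, 0]))   # a multiple of the first packet mod 3
print(tracker.add([0, 1, 1, 0]))   # second independent packet
print(tracker.null_basis())
```

Each call to `add` costs $O(N^2)$ field operations in this sketch; exploiting an $\omega$-sparse `w` when forming $\mathbf{w}^T \mathbf{B}$ would reduce that part to $O(\omega N)$, as noted above.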